Real time monitoring of COVID-19 intervention effectiveness through contact tracing data

Communities worldwide have used vaccines and facemasks to mitigate the COVID-19 pandemic. When an individual opts to vaccinate or wear a mask, they may lower their own risk of becoming infected as well as the risk that they pose to others while infected. The first benefit (reducing susceptibility) has been established across multiple studies, while the second (reducing infectivity) is less well understood. Using a new statistical method, we estimate the efficacy of vaccines and facemasks at reducing both types of risk from contact tracing data collected in an urban setting. We find that vaccination reduced the risk of onward transmission by 40.7% [95% CI 25.8–53.2%] during the Delta wave and 31.0% [95% CI 19.4–40.9%] during the Omicron wave, and that mask wearing reduced the risk of infection by 64.2% [95% CI 5.8–77.3%] during the Omicron wave. By harnessing commonly collected contact tracing data, the approach can broadly provide timely and actionable estimates of intervention efficacy against a rapidly evolving pathogen.

Finally, we take the expectation with respect to W of both sides. Note that the expectation is outside the ratio, unlike Equation 1.
However, for a Poisson regression these will be approximately equal, leading to the following estimator for Equation 1.
As an alternative to Poisson regression, we can instead directly model the CRR using the parametric g-computation formula. We choose a parametric logistic regression model for E(Y|A, W), which implies a closed-form expression for the risk at each intervention level. We next take the empirical average of that expression over the observed covariate values.
This leads to a natural g-computation based estimator of Equation 1 by taking the ratio of Equations 12 and 13. However, Equation 14 is only a point estimate, ĈRR. To reflect uncertainty in our estimate we adopt a Bayesian approach [1]. We replace the parameters (θ, θ_A) with random variables under a specified prior distribution, in our case N(0, σ^2). We now use the notation CRR(θ, θ_A) to make it clear that the causal risk ratio is a function of the random variables θ, θ_A in a Bayesian context.
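For concreteness, one way to write out this quantity is sketched below. It assumes the logistic model has an intercept θ_0 (a symbol not used in the text, and possibly absorbed into θ there), a coefficient θ_A on the intervention, and coefficients θ on the covariates W. It is a reconstruction under those assumptions, not a transcription of the numbered equations.

```latex
\mathrm{CRR}(\theta, \theta_A)
  = \frac{\frac{1}{n}\sum_{i=1}^{n}
          \operatorname{expit}\!\left(\theta_0 + \theta_A + \theta^{\top} W_i\right)}
         {\frac{1}{n}\sum_{i=1}^{n}
          \operatorname{expit}\!\left(\theta_0 + \theta^{\top} W_i\right)},
\qquad
\operatorname{expit}(x) = \frac{1}{1 + e^{-x}},
\qquad
\theta_0,\ \theta_A,\ \theta_j \sim N(0, \sigma^2).
```

The numerator is the average predicted risk with everyone set to A = 1, the denominator the same average with everyone set to A = 0, so the ratio is a ratio of expectations rather than an expectation of ratios.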
We view Equation 15 as a transformation of the posterior distribution of (θ, θ_A)|Y, A, W. Therefore, given a sample from the posterior distribution, denoted (θ*, θ*_A)|Y, A, W, we can generate a sample from the posterior over CRR(θ, θ_A).
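The transformation of posterior draws into CRR draws can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a single binary covariate W, an intercept, a prior scale σ = 2, and a plain random-walk Metropolis sampler, none of which are specified in the text. All function names and the example counts are hypothetical.

```python
import math
import random


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log1pexp(x):
    """Stable log(1 + exp(x))."""
    return x if x > 30 else math.log1p(math.exp(x))


def log_posterior(theta, cells, sigma):
    """Log posterior (up to a constant) for a logistic model with N(0, sigma^2) priors.

    theta = (intercept, theta_A, theta_W); cells maps (a, w) -> (positives, total).
    """
    t0, t_a, t_w = theta
    lp = -sum(t * t for t in theta) / (2.0 * sigma ** 2)  # Gaussian prior
    for (a, w), (pos, tot) in cells.items():
        eta = t0 + t_a * a + t_w * w
        lp += -pos * log1pexp(-eta) - (tot - pos) * log1pexp(eta)
    return lp


def crr(theta, w_share):
    """g-computation: average predicted risk over observed W under A=1 and A=0."""
    t0, t_a, t_w = theta
    risk1 = sum(s * expit(t0 + t_a + t_w * w) for w, s in w_share.items())
    risk0 = sum(s * expit(t0 + t_w * w) for w, s in w_share.items())
    return risk1 / risk0


def sample_crr_posterior(cells, sigma=2.0, n_iter=6000, burn_in=1000, step=0.08, seed=0):
    """Random-walk Metropolis over theta; returns one CRR value per retained draw."""
    rng = random.Random(seed)
    n = sum(tot for _, tot in cells.values())
    w_share = {}  # empirical distribution of W
    for (_, w), (_, tot) in cells.items():
        w_share[w] = w_share.get(w, 0.0) + tot / n
    theta = [0.0, 0.0, 0.0]
    lp = log_posterior(theta, cells, sigma)
    draws = []
    for it in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        lp_new = log_posterior(proposal, cells, sigma)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        if it >= burn_in:
            draws.append(crr(theta, w_share))  # transform each posterior draw
    return draws


# Hypothetical counts: (A, W) -> (number testing positive, number of individuals)
cells = {(0, 0): (134, 500), (1, 0): (77, 500), (0, 1): (225, 500), (1, 1): (145, 500)}
draws = sorted(sample_crr_posterior(cells))
posterior_mean = sum(draws) / len(draws)
interval = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])
print(posterior_mean, interval)
```

Because W and A are binary here, the data are collapsed to four cells, which keeps each likelihood evaluation cheap. Intervention effectiveness follows as one minus each CRR draw, so a credible interval for effectiveness comes from the same sample.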

Simulation Study
To demonstrate the validity of this approach, we perform the following simulation study.
• Sample n individuals.
• For each individual, jointly sample their intervention status (A) and their binary confounder status (W) under a pre-specified correlation ρ and a pre-specified marginal probability P(W = 1).
• For each individual, simulate test status (Y) from the outcome model conditional on A and W.
• As P(W = 1) is known, compute the true intervention effectiveness (E_A).
• Estimate E_A using two common approaches (Poisson regression and targeted maximum likelihood) and Bayesian parametric g-computation.
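The data-generating steps above can be sketched as follows. The outcome model is not reproduced in the text, so this sketch assumes a logistic model with hypothetical coefficients, and it assumes P(A = 1) = 0.5. It stops at the true effectiveness and a crude (unadjusted) comparator; the Poisson, TMLE and Bayesian estimators are not implemented here.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def joint_probs(p_a, p_w, rho):
    """Joint distribution of binary (A, W) with given marginals and correlation rho."""
    p11 = p_a * p_w + rho * math.sqrt(p_a * (1 - p_a) * p_w * (1 - p_w))
    probs = {(1, 1): p11, (1, 0): p_a - p11,
             (0, 1): p_w - p11, (0, 0): 1 - p_a - p_w + p11}
    if min(probs.values()) < 0:
        raise ValueError("rho is not attainable for these marginals")
    return probs


def simulate(n, p_a, p_w, rho, beta, rng):
    """Draw (A, W, Y) for n individuals.

    beta = (b0, bA, bW) are coefficients of an ASSUMED logistic outcome model.
    """
    probs = joint_probs(p_a, p_w, rho)
    pairs = list(probs)
    weights = [probs[k] for k in pairs]
    data = []
    for a, w in rng.choices(pairs, weights=weights, k=n):
        y = int(rng.random() < expit(beta[0] + beta[1] * a + beta[2] * w))
        data.append((a, w, y))
    return data


def true_effectiveness(p_w, beta):
    """E_A = 1 - CRR, with risks marginalised over the known P(W = 1)."""
    b0, b_a, b_w = beta
    risk1 = (1 - p_w) * expit(b0 + b_a) + p_w * expit(b0 + b_a + b_w)
    risk0 = (1 - p_w) * expit(b0) + p_w * expit(b0 + b_w)
    return 1 - risk1 / risk0


def crude_effectiveness(data):
    """Unadjusted 1 - risk ratio, ignoring W; shown only to illustrate confounding."""
    risk = {}
    for arm in (0, 1):
        ys = [y for a, _, y in data if a == arm]
        risk[arm] = sum(ys) / len(ys)
    return 1 - risk[1] / risk[0]


rng = random.Random(1)
beta = (-1.0, -0.7, 0.8)  # hypothetical coefficients
data = simulate(20000, p_a=0.5, p_w=0.4, rho=0.3, beta=beta, rng=rng)
print(true_effectiveness(0.4, beta), crude_effectiveness(data))
```

With a positive ρ and a confounder that raises risk, the crude estimate is pulled below the true effectiveness, which is the gap the adjusted estimators are meant to close.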
The Bayesian parametric g-computation estimator had lower mean-squared error (MSE) than the Poisson regression model across almost all values of true intervention effectiveness (Figure 1A). The improvement was most pronounced at intervention effectiveness levels between 25% and 95%. The absolute value of the MSE nevertheless remained small (< 0.02), suggesting that Poisson regression models are still relatively unbiased for intervention effectiveness. The Bayesian g-computation estimator was, however, much better calibrated than the Poisson regression based estimator (Figure 1B), confirming the result that Poisson regression based estimators of risk ratios from binary data are not well calibrated [3].
When compared to a leading causal estimator, targeted maximum likelihood estimation (TMLE), Bayesian g-computation performs similarly on both MSE and coverage. When all covariates, treatments, and outcomes are binary, the flexible machine learning models used in TMLE may not significantly improve performance.

Effect of Network Structure and Missingness on Vaccine Effectiveness Estimation
We begin with a description of the simulation study used to quantify the bias introduced by homophily and heterophily alone (code available from https://github.com/gcgibson/ve_simulation). We then expand the simulation to include preferential missingness of unvaccinated individuals who were more likely to test positive. To simulate an epidemic on a network of N individuals under complete observation, we perform the following steps.
• Randomly draw a contact network of size N x N from the power law graph generator in the igraph package, assuming power law contact networks are representative of the actual contact network.
• Randomly choose K individuals to be the initially infected index cases, or choose the initially infected individuals based on the degree distribution.
• Randomly assign vaccination status to each of the K index cases, or choose the vaccination status of the initially infected individuals based on the degree distribution.
• For each contact of each index case, randomly assign the contact's vaccination status according to a correlation parameter ρ. This induces either homophily (ρ > 0.5) or heterophily (ρ < 0.5).
• Simulate a single step of an epidemic. Record the successful and attempted transmission pairs (i, j) along with their vaccination statuses, as contact tracers would do in real time.
• Compute the unadjusted estimator (raw vaccine effectiveness) and our estimator in Equation 3 of the manuscript (adjusted vaccine effectiveness).
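The steps above can be sketched in plain Python as follows. Several choices here are assumptions, not the authors' code: a small preferential-attachment generator stands in for the igraph power-law generator; ρ is read as the probability that a contact shares the index case's vaccination status; transmission probability is taken as multiplicative in the two statuses; and the "adjusted" estimator is a simple standardisation over index-case status, standing in for the regression estimator of Equation 3. Only the uniform-random variants of the index-case steps are implemented.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Barabasi-Albert-style graph with heavy-tailed degrees (stand-in for igraph)."""
    neighbors = {i: set() for i in range(n)}
    targets = list(range(m))  # the first new node attaches to the m seed nodes
    repeated = []             # each node appears once per incident edge
    for new in range(m, n):
        for t in set(targets):
            neighbors[new].add(t)
            neighbors[t].add(new)
            repeated.extend([new, t])
        targets = [rng.choice(repeated) for _ in range(m)]
    return neighbors


def simulate_pairs(neighbors, k, rho, p_index_vax, base, ve_s, ve_i, rng):
    """One epidemic step; returns (index_vax, contact_vax, infected) per attempted pair."""
    index_cases = rng.sample(sorted(neighbors), k)
    index_set = set(index_cases)
    vax = {i: int(rng.random() < p_index_vax) for i in index_cases}
    records = []
    for i in index_cases:
        for j in sorted(neighbors[i]):
            if j in index_set:
                continue
            if j not in vax:  # assigned once: matches the index case with probability rho
                vax[j] = vax[i] if rng.random() < rho else 1 - vax[i]
            # ASSUMED transmission model: multiplicative in both vaccination statuses
            p = base * (1 - ve_s) ** vax[j] * (1 - ve_i) ** vax[i]
            records.append((vax[i], vax[j], int(rng.random() < p)))
    return records


def risk(records, index_vax=None, contact_vax=None):
    """Attack rate among pairs matching the given vaccination statuses."""
    sel = [y for iv, cv, y in records
           if (index_vax is None or iv == index_vax)
           and (contact_vax is None or cv == contact_vax)]
    return sum(sel) / len(sel)


def raw_ve(records):
    """Unadjusted VE of contact vaccination, ignoring the index case's status."""
    return 1 - risk(records, contact_vax=1) / risk(records, contact_vax=0)


def adjusted_ve(records):
    """VE standardised over index-case vaccination status."""
    n = len(records)
    share = {iv: sum(1 for r in records if r[0] == iv) / n for iv in (0, 1)}
    risk1 = sum(share[iv] * risk(records, iv, 1) for iv in (0, 1))
    risk0 = sum(share[iv] * risk(records, iv, 0) for iv in (0, 1))
    return 1 - risk1 / risk0


rng = random.Random(2)
graph = preferential_attachment_graph(20000, 3, rng)
records = simulate_pairs(graph, k=2000, rho=0.8, p_index_vax=0.5,
                         base=0.3, ve_s=0.5, ve_i=0.4, rng=rng)
print(raw_ve(records), adjusted_ve(records))
```

Sweeping `rho` from below 0.5 to above 0.5 while holding `ve_s` fixed gives the kind of comparison described in the text, with the raw and adjusted estimates plotted against the true value.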
We then perform the simulation for a variety of values of the correlation parameter between the index case vaccination status and the contact vaccination status for a given true vaccine effectiveness (Figure 2A). The unadjusted estimator is significantly biased in the negative direction under heterophily and in the positive direction under homophily. However, the adjusted estimator is much less biased. This is because we are explicitly conditioning on the vaccination status of the index case in the adjusted estimator of Equation 3. If unvaccinated individuals preferentially attach to unvaccinated individuals, the regression model will adjust for this.
However, if unvaccinated individuals are both more likely to test positive and more likely to be missing, there may be significant bias. To assess the bias induced in this situation, we augment the simulation study described above by introducing an observation probability on unvaccinated contacts who test positive and are connected to unvaccinated index cases (Figure 2B). In the context of our study, this observation probability may reflect unvaccinated individuals who engage in risk-taking behavior and are therefore more likely to test positive and less likely to respond to contact tracers. When vaccine effectiveness is particularly low (13%), the magnitude of the downward bias increases significantly (Figure 2B), reaching beyond -100%. We believe this to be the source of the bias in the Omicron estimate of vaccine effectiveness, as the percentage of successful case-contact investigations dropped drastically during the Omicron period. The inability to successfully investigate case-contact pairs preferentially affected unvaccinated contacts who were more likely to test positive. However, public health officials can monitor the rate of successful follow-ups to identify periods when the estimate may become unreliable.
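The direction of this bias can be illustrated with expected cell counts, without any random simulation. The sketch below is illustrative only and does not reproduce Figure 2B: it assumes a multiplicative transmission model, hypothetical values for the baseline risk and for the index-case effect, and a stratified estimator standardised over index-case status. The parameter `q` is the observation probability for infected unvaccinated contacts of unvaccinated index cases.

```python
def expected_ve(ve_s, ve_i, base, rho, q, p_index_vax=0.5):
    """Expected stratified VE when infected, unvaccinated contacts of unvaccinated
    index cases are observed with probability q (all other pairs always observed)."""
    obs_risk, obs_n = {}, {}
    for iv in (0, 1):
        for cv in (0, 1):
            p_iv = p_index_vax if iv == 1 else 1 - p_index_vax
            share = p_iv * (rho if iv == cv else 1 - rho)   # expected share of pairs
            r = base * (1 - ve_s) ** cv * (1 - ve_i) ** iv  # true attack rate (assumed model)
            keep = q if (iv, cv) == (0, 0) else 1.0
            obs_n[(iv, cv)] = share * (keep * r + (1 - r))  # expected observed pairs
            obs_risk[(iv, cv)] = keep * r / (keep * r + (1 - r))
    total = sum(obs_n.values())
    weight = {iv: (obs_n[(iv, 0)] + obs_n[(iv, 1)]) / total for iv in (0, 1)}
    risk1 = sum(weight[iv] * obs_risk[(iv, 1)] for iv in (0, 1))
    risk0 = sum(weight[iv] * obs_risk[(iv, 0)] for iv in (0, 1))
    return 1 - risk1 / risk0


# Hypothetical settings: true VE of 13%, with decreasing observation probability
for q in (1.0, 0.5, 0.2):
    print(q, expected_ve(ve_s=0.13, ve_i=0.3, base=0.3, rho=0.5, q=q))
```

With complete observation (`q = 1`) the expected estimate equals the true effectiveness. As `q` falls, observed risk among unvaccinated contacts shrinks and the estimate moves downward, turning negative when the true effectiveness is low.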